MetaPoison: Practical General-purpose Clean-label Data Poisoning

Neural Information Processing Systems

Data poisoning---the process by which an attacker takes control of a model by making imperceptible changes to a subset of the training data---is an emerging threat in the context of neural networks. Existing attacks for data poisoning neural networks have relied on hand-crafted heuristics, because solving the poisoning problem directly via bilevel optimization is generally thought to be intractable for deep models. We propose MetaPoison, a first-order method that approximates the bilevel problem via meta-learning and crafts poisons that fool neural networks. MetaPoison is effective: it outperforms previous clean-label poisoning methods by a large margin. MetaPoison is robust: poisoned data made for one model transfer to a variety of victim models with unknown training settings and architectures. MetaPoison is general-purpose: it works not only in fine-tuning scenarios, but also for end-to-end training from scratch, which until now has not been feasible for clean-label attacks on deep nets. MetaPoison can achieve arbitrary adversary goals---such as using poisons of one class to make a target image don the label of another arbitrarily chosen class. Finally, MetaPoison works in the real world. We demonstrate, for the first time, successful data poisoning of models trained on the black-box Google Cloud AutoML API.
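The bilevel structure the abstract describes---an outer loop that perturbs poison points so that the model produced by a few unrolled inner training steps misclassifies a target---can be sketched on a toy problem. The sketch below is an illustrative simplification, not the paper's implementation: the victim is a one-parameter linear model, only two inner SGD steps are unrolled, and the meta-gradient is estimated by finite differences rather than reverse-mode automatic differentiation (and there is no ensembling over checkpoints). All function names and constants are invented for the example.

```python
# Toy MetaPoison-style poison crafting on a 1-D linear model y = w * x.
# The real method differentiates through K unrolled SGD steps of a deep
# net with reverse-mode autodiff and ensembles over staggered checkpoints;
# here the meta-gradient is approximated by central finite differences so
# the example stays self-contained.

def inner_train(w, data, lr=0.1, steps=2):
    """Unroll a few SGD steps of the victim's training (squared loss)."""
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w = w - lr * grad
    return w

def adversarial_loss(w, x_target, y_adv):
    """Outer objective: make the trained model map the target to y_adv."""
    return (w * x_target - y_adv) ** 2

def craft_poison(x_clean, y_clean, clean_data, x_target, y_adv,
                 eps=0.5, craft_lr=0.05, craft_iters=200, w0=0.0):
    """Perturb one training point within an eps-ball; its label stays clean."""
    x_p = x_clean
    h = 1e-4
    for _ in range(craft_iters):
        def outer(xp):
            w = inner_train(w0, clean_data + [(xp, y_clean)])
            return adversarial_loss(w, x_target, y_adv)
        # Finite-difference estimate of the meta-gradient d L_adv / d x_p.
        g = (outer(x_p + h) - outer(x_p - h)) / (2 * h)
        x_p = x_p - craft_lr * g
        # Project back into the imperceptibility ball around the clean point.
        x_p = max(x_clean - eps, min(x_clean + eps, x_p))
    return x_p

clean_data = [(1.0, 1.0), (2.0, 2.0)]            # victim trains toward w = 1
x_p = craft_poison(x_clean=1.5, y_clean=1.5, clean_data=clean_data,
                   x_target=3.0, y_adv=0.0)       # attacker wants w pushed low
w_poisoned = inner_train(0.0, clean_data + [(x_p, 1.5)])
w_clean = inner_train(0.0, clean_data + [(1.5, 1.5)])
```

The projection step is what keeps the attack "clean-label": the poison keeps its correct label and only moves within a small ball around the original point, while the outer loop searches that ball for the perturbation that most degrades the victim's behavior on the target after the unrolled training steps.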


Review for NeurIPS paper: MetaPoison: Practical General-purpose Clean-label Data Poisoning

Weaknesses: Aside from the experiments and the demonstration of data poisoning on some real applications, the paper does not have much novelty. Several previous works have shown that data poisoning has a bilevel formulation and showed how to solve it for simple machine learning models. The first-order method described in the paper to approximately solve the bilevel problem has also been applied to this problem previously by [Munoz-Gonzalez et al.]. The paper uses a few-step approximation and reverse-mode automatic differentiation coupled with ensembling and reinitialization to solve the problem. As such, the algorithm is not new; however, the paper shows that the approach can be used to poison neural networks without handcrafted heuristics like watermarking, which were essential in previous works.


Review for NeurIPS paper: MetaPoison: Practical General-purpose Clean-label Data Poisoning

I want to thank the authors for preparing the rebuttal and for sharing their concerns about one of the reviews. This paper was heavily discussed among all the reviewers during the post-rebuttal discussion phase. Given the paper's borderline scores, we also requested additional emergency reviews for this paper -- I hope that the authors will find this additional feedback useful. During the discussion phase, all the reviewers acknowledged the importance of the proposed practical and scalable method for poisoning neural networks. Based on the post-rebuttal discussions, some of the reviewers have updated their reviews, providing additional comments.

